Preface

Read This First 



About This Manual 

This manual introduces the TMS320C62xx devices. The TMS320C6201 de- 
vice is the most powerful general-purpose programmable digital signal pro- 
cessor (DSP) available. The information in this manual describes the devices 
and provides a basic overview of how to use them. For more detailed informa- 
tion, see the related documentation. 

How to Use This Manual 

This document contains the following chapters: 

Chapter 1, Introduction, describes the main features of the TMS320C62xx
devices, the history of TI DSPs, and typical applications.

Chapter 2, CPU Architecture, describes the architecture of the TMS320C62xx 
devices, with a block diagram and brief introduction to the parts of the device. 

Chapter 3, Memory, describes the on-chip memory and the external memory 
interface. 

Chapter 4, Peripherals, describes the peripherals available for the 'C62xx
devices, such as ports, timers, direct-memory access, and power-down logic.

Chapter 5, Development Support, describes the tools, documentation, Web 
site, and third-party support for the TMS320C6x. 

Related Documentation From Texas Instruments

The following books describe the TMS320C62xx devices and related support
tools. To obtain a copy of any of these TI documents, call the Texas
Instruments Literature Response Center at (800) 477-8924. When ordering, please
identify the book by its title and literature number.

TMS320C6x Assembly Language Tools User's Guide (literature number 
SPRU186) describes the assembly language tools (assembler, linker, 
and other tools used to develop assembly language code), assembler 
directives, macros, common object file format, and symbolic debugging 
directives for the 'C6x generation of devices. 

TMS320C62xx CPU and Instruction Set Reference Guide (literature 
number SPRU189) describes the 'C62xx CPU architecture, instruction 
set, pipeline, and interrupts for the TMS320C62xx digital signal proces- 
sors. 

TMS320C6x C Source Debugger User's Guide (literature number 
SPRU188) tells you how to invoke the 'C6x simulator versions of the C 
source debugger interface. This book discusses various aspects of the 
debugger interface, including window management, command entry, 
code execution, data management, and breakpoints. 

TMS320 DSP Designer's Notebook: Volume 1 (literature number
SPRT125) presents solutions to common design problems using 'C2x,
'C3x, 'C4x, 'C5x, and other TI DSPs.

TMS320C6x Optimizing C Compiler User's Guide (literature number 
SPRU187) describes the 'C6x C compiler. This C compiler accepts ANSI 
standard C source code and produces assembly language source code 
for the 'C6x generation of devices. This book also describes the 
assembly optimizer, which helps you optimize your assembly code. 

TMS320C62xx Peripherals Reference Guide (literature number SPRU190) 
describes common peripherals available on the TMS320C62xx digital 
signal processors. This book includes information on the internal data 
and program memories, the external memory interface (EMIF), the host 
port, serial ports, direct memory access (DMA), clocking and phase- 
locked loop (PLL), and the power-down modes. 

TMS320C62xx Programmer's Guide (literature number SPRU198)
describes ways to optimize C and assembly code and includes
application program examples.

TMS320C6x Software Tools Getting Started Guide (literature number
SPRU185) describes how to install the TMS320C6x assembly language
tools, the C compiler, the simulator, and the C source debugger.
Installation instructions for SunOS™, Solaris™, Windows™ 95, and Windows
NT™ systems are given.

TMS320C6201 Digital Signal Processor Data Sheet (literature number 
SPRS051) describes the features of the TMS320C6xx and provides pin- 
outs, electrical specifications, and timings for the device. 

If You Need Assistance . . .



□ World-Wide Web Sites

TI Online                                        http://www.ti.com
Semiconductor Product Information Center (PIC)   http://www.ti.com/sc/docs/pic/home.htm
DSP Solutions                                    http://www.ti.com/dsps
320 Hotline On-line™                             http://www.ti.com/sc/docs/dsps/support.htm



□ North America, South America, Central America

Product Information Center (PIC)          (972) 644-5580
TI Literature Response Center U.S.A.      (800) 477-8924
Software Registration/Upgrades            (214) 638-0333   Fax: (214) 638-7742
U.S.A. Factory Repair/Hardware Upgrades   (281) 274-2285
U.S. Technical Training Organization      (972) 644-5580
DSP Hotline                               (281) 274-2320   Fax: (281) 274-2324   Email: dsph@ti.com
DSP Modem BBS                             (281) 274-2323
DSP Internet BBS via anonymous ftp to ftp://ftp.ti.com/mirrors/tms320bbs



□ Europe, Middle East, Africa

European Product Information Center (EPIC) Hotlines:
Multi-Language Support               +33 1 30 70 11 69   Fax: +33 1 30 70 10 32   Email: epic@ti.com
Deutsch                              +49 8161 80 33 11 or +33 1 30 70 11 68
English                              +33 1 30 70 11 65
Francais                             +33 1 30 70 11 64
Italiano                             +33 1 30 70 11 67
EPIC Modem BBS                       +33 1 30 70 11 99
European Factory Repair              +33 4 93 22 25 40
Europe Customer Training Helpline                        Fax: +49 81 61 80 40 10



□ Asia-Pacific

Literature Response Center   +852 2 956 7288   Fax: +852 2 956 2200
Hong Kong DSP Hotline        +852 2 956 7268   Fax: +852 2 956 1002
Korea DSP Hotline            +82 2 551 2804    Fax: +82 2 551 2828
Korea DSP Modem BBS          +82 2 551 2914
Singapore DSP Hotline                          Fax: +65 390 7179
Taiwan DSP Hotline           +886 2 377 1450   Fax: +886 2 377 2718
Taiwan DSP Modem BBS         +886 2 376 2592
Taiwan DSP Internet BBS via anonymous ftp to ftp://dsp.ee.tit.edu.tw/pub/TI/



□ Japan

Product Information Center   +0120-81-0026 (in Japan)                Fax: +0120-81-0036 (in Japan)
                             +03-3457-0972 or (INTL) 813-3457-0972   Fax: +03-3457-1259 or (INTL) 813-3457-1259
DSP Hotline                  +03-3769-8735 or (INTL) 813-3769-8735   Fax: +03-3457-7071 or (INTL) 813-3457-7071
DSP BBS via Nifty-Serve      Type "Go TIASP"



□ Documentation

When making suggestions or reporting errors in documentation, please include the following information that is on the title
page: the full title of the book, the publication date, and the literature number.

Mail:  Texas Instruments Incorporated
       Technical Documentation Services, MS 702
       P.O. Box 1443
       Houston, Texas 77251-1443

Email: comments@books.sc.ti.com

Note: When calling a Literature Response Center to order documentation, please specify the literature number of the
book.

Trademarks

Classico, MicroLite, and Virtuoso Nano are trademarks of Eonic Systems, Inc. 
EVP is a trademark of D2 Technologies. 
InvisiLink is a trademark of ViaDSP, Inc. 

PC is a trademark of International Business Machines Corporation. 

Solaris, SunOS, and Sun-3 are trademarks of Sun Microsystems, Inc. 

TI, cDSP, VelociTI, and XDS510 are trademarks of Texas Instruments Incorporated.

Windows, Windows 95, and Windows NT are registered trademarks of Microsoft Corporation 
(Windows™, Windows™ 95, Windows NT™). 

Chapter 1



Introduction 



The TMS320C62xx devices feature VelociTI™, an advanced very long instruc- 
tion word (VLIW) architecture developed by Texas Instruments. VelociTI, to- 
gether with the development tool set and evaluation tools, provides faster de- 
velopment time and higher performance with increased instruction-level paral- 
lelism. 







1.1 Introduction to the TMS320C6x Generation

With performance of up to 1600 million instructions per second (MIPS) and a 
complete set of development tools, the 'C62xx devices offer cost-effective 
solutions to high-performance DSP programming challenges. The 'C6x
development tools include a new C compiler, an assembly optimizer that
simplifies programming and scheduling, and a Windows-based debugger
interface. VelociTI combines advanced VLIW architecture with a high degree 
of parallelism to produce a device that enables applications such as: 

□ Unlimited Internet bandwidth 

□ Universal wireless communications 

□ Radical new telephony features 

□ Remote medical diagnostics 

□ Ultimate automated cruise control 

□ Personal home base station 

□ Personalized home security 

The 'C62xx devices also can be used for improved performance on existing 
applications, such as: 

□ Wireless base stations 

□ Pooled modems and remote access servers 

□ Next-generation xDSL modems and cable modems 

□ Multichannel telephony platforms including central office switches, PBXs, 
and voice-messaging systems 

□ Multimedia systems 

1.2 The TMS320 Family of DSPs

The TMS320 family consists of both 16-bit fixed-point and 32-bit floating-point 
devices. These DSPs possess the operational flexibility of high-speed 
controllers and the numerical capability of array processors. The following 
characteristics make this family the ideal choice for a wide range of processing 
applications: 

□ Very flexible instruction set 

□ Inherent operational flexibility 

□ High-speed performance 

□ Innovative, parallel architectural design 

□ Cost-effectiveness 

1.2.1 History, Development, and Advantages of TMS320 DSPs

In 1982, Texas Instruments introduced the TMS32010, the first fixed-point
DSP in the TMS320 family. Before the end of the year, Electronic Products
magazine awarded the TMS32010 the title "Product of the Year". The
TMS32010 became the model for future TMS320 generations.

Today, the TMS320 family consists of nine generations: the 'C1x, 'C2x, 'C2xx,
'C5x, and 'C54x are fixed-point, the 'C3x and 'C4x are floating-point, the 'C8x
is a multiprocessor, and the 'C6x will offer both fixed-point and floating-point
devices. The first device in the 'C6x generation is the TMS320C6201, which
is a fixed-point DSP.

Each generation of TMS320 devices has a central processing unit (CPU) and 
a variety of on-chip memory and peripheral configurations. These spin-off de- 
vices satisfy a wide range of needs in the worldwide electronics market. When 
memory and peripherals are integrated into one processor, the overall system 
cost is greatly reduced, and circuit board space is saved. Figure 1-1 shows 
the progress of the TMS320 family of devices. 

Figure 1-1. The TMS320 Family of DSPs

1.2.2 Typical Applications 

The TMS320 family of DSPs offers better, more adaptable approaches to tradi- 
tional signal-processing problems, such as vocoding, filtering, and error cod- 
ing. Furthermore, the TMS320 family supports complex applications that often 
require multiple operations to be performed simultaneously. Table 1-1 shows 
many of the typical applications of the TMS320 family. 

Table 1-1. Typical Applications for the TMS320 Family

Automotive                          Consumer                            Control
----------------------------------  ----------------------------------  ----------------------
Antiskid brakes                     Educational toys                    Engine control
Cellular telephones                 Music synthesizers                  Laser printer control
Digital radios                      Pagers                              Motor control
Engine control                      Power tools                         Robotics control
Global positioning                  Radar detectors                     Servo control
Navigation                          Solid-state answering machines
Vibration analysis
Voice commands

General-Purpose                     Graphics/Imaging                    Industrial
----------------------------------  ----------------------------------  ----------------------
Adaptive filtering                  3-D rotation                        Numeric control
Convolution                         Animation/digital maps              Power-line monitoring
Correlation                         Homomorphic processing              Robotics
Digital filtering                   Image compression/transmission      Security access
Fast Fourier transforms             Image enhancement
Hilbert transforms                  Pattern recognition
Waveform generation                 Robot vision
Windowing                           Workstations

Instrumentation                     Medical                             Military
----------------------------------  ----------------------------------  ----------------------
Digital filtering                   Diagnostic equipment                Image processing
Function generation                 Fetal monitoring                    Missile guidance
Pattern matching                    Hearing aids                        Navigation
Phase-locked loops                  Patient monitoring                  Radar processing
Seismic processing                  Prosthetics                         Radio frequency modems
Spectrum analysis                   Ultrasound equipment                Secure communications
Transient analysis                                                      Sonar processing

Telecommunications                  Telecommunications (continued)      Voice/Speech
----------------------------------  ----------------------------------  ----------------------
1200- to 56 600-bps modems          Faxing                              Speaker verification
Adaptive equalizers                 Future terminals                    Speech enhancement
ADPCM transcoders                   Line repeaters                      Speech recognition
Base stations                       Personal communications             Speech synthesis
Cellular telephones                   systems (PCS)                     Speech vocoding
Channel multiplexing                Personal digital assistants (PDA)   Text-to-speech
Data encryption                     Speaker phones                      Voice mail
Digital PBXs                        Spread spectrum communications
Digital speech interpolation (DSI)  xDSL
DTMF encoding/decoding              Video conferencing
Echo cancellation                   X.25 packet switching

1.3 Key Features of the TMS320C62xx Devices

The TMS320C62xx devices are fixed-point processors based on the
advanced VLIW CPU with eight functional units, including two multipliers and six
arithmetic logic units. The CPU can execute up to eight instructions per cycle
for up to ten times the performance of typical DSPs. The advanced VLIW
architecture allows designers to develop highly effective reduced instruction-set
computer (RISC)-like code for fast development time. Features common to
all the devices in the 'C62xx series are listed in Table 1-2.



Table 1-2. Key Features of the TMS320C62xx Devices

Feature                                     Benefit
------------------------------------------  ----------------------------------------------------------------
Advanced VLIW CPU with eight functional     Executes up to eight instructions per cycle for up to ten times
units including two multipliers and         the performance of typical DSPs.
six arithmetic logic units                  Allows designers to develop highly effective reduced
                                            instruction-set computer (RISC)-like code for fast development
                                            time.

Instruction packing                         Code size equivalence for eight instructions executed serially
                                            or in parallel.
                                            Reduces code size, program fetches, and power consumption.

100% conditional instructions               Reduces costly branching.
                                            Increases parallelism for higher sustained performance.

Code executes as programmed on              The most efficient C compiler in the industry on DSP benchmark
highly independent functional units         suite and industry's first assembly optimizer for fast
                                            development time.

8-/16-/32-bit data support                  Efficient memory support for a variety of applications.

40-bit arithmetic options                   Extra precision for vocoders and other computationally
                                            intensive applications.

Saturation and normalization                Support for key arithmetic operations.

Bit-field manipulation and instruction:     Supports common operations found in control and data
extract, set, clear, bit counting           manipulation applications.



The first device in the family is the TMS320C6201. The early release of this
device includes memory, the external memory interface (EMIF), direct
memory access (DMA) with two channels, the host-port interface (HPI), and
a flexible phase-locked loop (PLL) clock generator. The production version of
this device will also have two enhanced-buffered serial ports and two 32-bit
timers.

Table 1-3 summarizes the key features of the TMS320C6201 device. 

Table 1-3. Features of the TMS320C6201

Feature                                     Benefit
------------------------------------------  ----------------------------------------------------------------
Bit-field manipulation and instruction:     Supports common operations found in control and data
extract, set, clear, bit counting           manipulation applications.

1M-bit on-chip memory (512K-bit             Fast algorithm execution with fewer components per system.
program, 512K-bit data)

32-bit external memory interface            High-speed connections to external memory for maximum
supports synchronous dynamic random         sustained performance.
access memory (SDRAM), synchronous
burst static RAM (SBSRAM), and static
RAM (SRAM)

Two enhanced-buffered serial ports          Provides high-speed interprocessor communication.
(EBSPs)

16-bit host access port                     Host processor access to on-chip data memory.

Two direct memory access (DMA) channels     Efficient access to external memory/peripherals while
with boot loading capability                minimizing CPU interrupts.

Flexible PLL clock generator                Multiplies external clock rate by two or four for maximum CPU
                                            performance.

352-lead ball grid array package            Ultra-thin package minimizes board space.

Chapter 2

CPU Architecture





The VelociTI architecture makes the 'C62xx the first off-the-shelf DSP to use 
advanced VLIW to achieve high performance through increased instruction- 
level parallelism. A traditional VLIW architecture consists of multiple execution 
units running in parallel that perform multiple instructions during a single clock 
cycle. Parallelism is the key to extremely high performance, taking these next- 
generation DSPs well beyond the performance capabilities of traditional 
superscalar designs. VelociTI is a highly deterministic architecture, with few 
restrictions on how or when instructions are fetched, executed, or stored. This 
architectural flexibility is key to the break-through efficiency levels of the 'C6x 
compiler. 

2.1 TMS320C62xx Block Diagram

The 'C62xx processor consists of three main parts: the CPU (or the "core"),
peripherals, and memory. The first device in the series, the TMS320C6201,
is a fixed-point DSP using the VelociTI VLIW architecture. Eight functional
units operate in parallel, with two identical sets of the basic four functional
units. The units communicate through two register files, each of which contains
16 32-bit registers. Program parallelism is defined at compile time, since no
data dependency checking is done in hardware at run time. The
256-bit-wide program memory fetches eight 32-bit instructions every
cycle.
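
The fetch-width arithmetic above can be checked with a short sketch. The function names are illustrative only, and the clock rate computed at the end is merely the value implied by the 1600-MIPS peak figure quoted in Chapter 1; it is not stated in this brief.

```python
# Sketch of the fetch-packet arithmetic described in Section 2.1.
FETCH_PACKET_BITS = 256   # width of one program-memory fetch (from the text)
INSTRUCTION_BITS = 32     # width of one instruction (from the text)


def instructions_per_fetch(fetch_bits=FETCH_PACKET_BITS,
                           instruction_bits=INSTRUCTION_BITS):
    """Number of whole instructions delivered by one program fetch."""
    return fetch_bits // instruction_bits


def implied_clock_mhz(peak_mips, instructions_per_cycle):
    """Clock rate (MHz) implied by a peak MIPS figure, assuming the peak
    is reached when every functional unit issues on every cycle."""
    return peak_mips / instructions_per_cycle


if __name__ == "__main__":
    per_fetch = instructions_per_fetch()
    print("instructions per fetch packet:", per_fetch)
    print("clock implied by 1600 MIPS (MHz):", implied_clock_mhz(1600, per_fetch))
```

The peak figure is only reached when all eight functional units are kept busy; sustained throughput depends on how much parallelism the compiler or assembly optimizer finds.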

Figure 2-1 shows the block diagram for the TMS320C6201 digital signal pro- 
cessor (DSP). 'C62xx DSPs are based on the 'C62xx CPU. 'C62xx devices 
come with program memory which on some devices can be used as a program 
cache. The devices also have varying sizes of data memory. Peripherals such 
as a DMA controller, power-down logic, and EMIF usually come with the CPU, 
and peripherals such as serial ports and timers are available on certain de- 
vices. Check the data sheet for your device to determine the specific peripheral 
configurations you have. 



Figure 2-1. TMS320C6201 CPU Core With Peripherals 

2.2 Central Processing Unit (CPU)

The 'C62xx central processing unit (CPU) is the central building block of all the 
TMS320C62xx devices. The CPU contains: 

□ Program fetch unit 

□ Instruction dispatch unit 

□ Instruction decode unit 

□ 32 registers 

□ Two data paths, each with four functional units 

□ Control registers 

□ Control logic 

□ Test, emulation, and interrupt logic 

The CPU has two data paths where processing occurs. Each data path has 
four functional units (.L, .S, .M, .D) and a register file containing 16 32-bit 
registers. The functional units execute logic, shifting, multiply, and data 
address operations. All instructions operate on the registers. The two sets of 
data-addressing units (.D1 and .D2) are exclusively responsible for all data 
transfers between the register files and the memory. 

The four functional units on each side of the CPU share the control register 
files. Each side also has a single data bus connected to registers on the other 
side of the CPU so that the units can cross-exchange data from register files 
on opposite sides. Register access across the CPU supports only one read 
and write operation per cycle. 

The two sets of functional units include the following: 

□ Two multipliers 

□ Six arithmetic logic units (ALUs) 

□ 32 registers with 32-bit word length each 

Each functional unit is controlled by a 32-bit instruction. The instruction fetch, 
instruction dispatch, and instruction decode blocks can deliver up to eight 
32-bit instructions from the program memory to the functional units every 
cycle. The control register file provides methods to configure and control vari- 
ous aspects of processor operation. 

The VLIW processing flow begins when a 256-bit-wide instruction fetch packet 
is fetched from the internal program memory. The instructions are linked to- 
gether by the least significant bit (LSB) positions of the instruction. The instruc- 
tions linked together for simultaneous execution (up to eight in total) comprise 
an execute packet. For more details on the processing, see the 
TMS320C6201 Digital Signal Processor data sheet (literature number 
SPRS051). 
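
As a rough illustration of how a fetch packet is divided into execute packets, the sketch below groups eight instruction words by their least significant bit. It assumes the convention documented in the CPU and Instruction Set Reference Guide: an LSB of 1 means the next instruction executes in parallel with the current one, and an execute packet does not cross a fetch-packet boundary. The function name and the sample words are hypothetical.

```python
def split_execute_packets(fetch_packet):
    """Group the eight 32-bit words of a fetch packet into execute packets.

    Assumption: bit 0 of each word is the parallel bit. A value of 1 chains
    the following instruction into the same execute packet; a value of 0
    ends the current execute packet.
    """
    if len(fetch_packet) != 8:
        raise ValueError("a fetch packet holds exactly eight instructions")

    packets = []
    current = []
    for word in fetch_packet:
        current.append(word)
        if word & 1 == 0:          # parallel bit clear: packet ends here
            packets.append(current)
            current = []
    if current:
        # The last word asked to chain to an instruction outside this packet.
        raise ValueError("execute packet would cross the fetch-packet boundary")
    return packets


if __name__ == "__main__":
    # Made-up words; only bit 0 matters for the grouping.
    words = [0x1, 0x1, 0x0, 0x2, 0x5, 0x0, 0x1, 0x0]
    for packet in split_execute_packets(words):
        print(len(packet), [hex(w) for w in packet])
```

A fetch packet whose words all have the bit set except the last therefore issues all eight instructions in one cycle, while a packet with every bit clear issues them one per cycle.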

The program fetch, instruction dispatch, and instruction decode units can
deliver up to eight 32-bit instructions from the program memory to the functional
units every cycle. Processing occurs in each of the two data paths (A and B). 
Each data path has four functional units (.L, .S, .M, and .D) and a register file 
containing 16 32-bit registers. Each functional unit is controlled by a 32-bit 
instruction. To understand how instructions are fetched, dispatched, decoded, 
and executed in the data path, refer to the chapter on pipeline operation in the 
TMS320C62xx CPU and Instruction Set Reference Guide (literature number 
SPRU189). 

2.3 CPU Data Paths 

Figure 2-2 shows the 'C62xx CPU data paths, which consist of:

□ Two general-purpose register files (A and B)

□ Eight functional units (.L1, .L2, .S1, .S2, .M1, .M2, .D1, and .D2)

□ Two load-from-memory paths (LD1 and LD2)

□ Two store-to-memory paths (ST1 and ST2)

□ Two register file cross paths (1X and 2X)

2.3.1 General-Purpose Register Files 

There are two general-purpose register files (A and B) in the 'C62xx data
paths. Each of these files contains 16 32-bit registers (labeled A0-A15 for file
A and B0-B15 for file B). The general-purpose registers can be used for data
or data-address pointers. Registers A1, A2, B0, B1, and B2 can be used as
condition registers.

2.3.2 Functional Units 

The eight functional units in the 'C62xx data paths can be divided into two 
groups of four, each of which is virtually identical for each register file. The 
functional units are described in Table 2-1. 

Most data lines in the CPU support 32-bit operands, and some support long
(40-bit) operands. Each functional unit has its own 32-bit write port into a
general-purpose register file. All units ending in 1 (for example, .L1) write to
register file A and all units ending in 2 write to register file B. Each functional unit
has two 32-bit read ports for source operands src1 and src2. Four units (.L1,
.L2, .S1, .S2) have an extra 8-bit-wide port for 40-bit long writes as well as an
8-bit input for 40-bit long reads. Because each unit has its own 32-bit write port,
all eight units can be used in parallel every cycle.

Table 2-1. Functional Units and Descriptions

Functional Unit      Description
-------------------  ------------------------------------------------------------
.L Unit (.L1, .L2)   32/40-bit arithmetic and compare operations
                     Leftmost 1 or 0 bit counting for 32 bits
                     Normalization count for 32 and 40 bits
                     32-bit logical operations

.S Unit (.S1, .S2)   32-bit arithmetic operations
                     32/40-bit shifts and 32-bit bit-field operations
                     32-bit logical operations
                     Branching
                     Constant generation
                     Register transfers to/from the control register file

.M Unit (.M1, .M2)   16 x 16-bit multiplies

.D Unit (.D1, .D2)   32-bit add, subtract, linear and circular address calculation



2.3.3 Register File Cross Paths 

Each general-purpose register file is connected to the opposite register file's
functional units by the 1X and 2X paths. These paths allow the .S, .M, and .L
units from each side to access operands from either file.

Four units (.M1, .M2, .S1, .S2) have one 32-bit input that is mux-selectable between
the same-side register file (A for units ending in 1 and B for units ending in
2) and the opposite file via the cross paths (1X and 2X). The 32-bit inputs on
the .L1 and .L2 units are both mux-selectable via the cross paths.

2.3.4 Memory, Load, and Store Paths 

There are two 32-bit paths for loading data from memory to the register file: 
one (LD1) for register file A, and one (LD2) for register file B. There are also 
two 32-bit paths, ST1 and ST2, for storing register values to memory from each 
register file. The store paths are shared with the .L and .S long read paths. 

2.3.5 Data-Address Paths 

The data-address paths (DA1 and DA2) coming out of the .D units allow data 
addresses generated from one register file to support loads and stores to 
memory from the other register file. 

Figure 2-2. TMS320C62xx CPU Data Paths

2.4 Mapping Between Instructions and Functional Units

Table 2-2 and Table 2-3 define the mapping between instructions and func- 
tional units. The first table lists the instructions that can be used on each func- 
tional unit. The second table lists the instructions alphabetically with the func- 
tional unit where each instruction can be used checked under the units. 

Table 2-2. Instruction to Functional Unit Mapping

.L Unit    .M Unit    .S Unit    .D Unit
---------  ---------  ---------  -------------------------
ABS        MPY        ADD        ADD
ADD        SMPY       ADDK       ADDA
AND                   ADD2       LD mem
CMPEQ                 AND        LD mem (15-bit offset)‡
CMPGT                 B disp     MV
CMPGTU                B IRP†     NEG
CMPLT                 B NRP†     ST mem
CMPLTU                B reg      ST mem (15-bit offset)‡
LMBD                  CLR        SUB
MV                    EXT        SUBA
NEG                   EXTU       ZERO
NORM                  MVC†
NOT                   MV
OR                    MVK
SADD                  MVKH
SAT                   NEG
SSUB                  NOT
SUB                   OR
SUBC                  SET
XOR                   SHL
ZERO                  SHR
                      SHRU
                      SSHL
                      STP†
                      SUB
                      SUB2
                      XOR
                      ZERO

† .S2 only
‡ .D2 only
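
The mapping in Table 2-2 lends itself to a simple lookup structure. The sketch below transcribes the table into a dictionary and answers the question "which unit types can execute this mnemonic?". The names `UNIT_INSTRUCTIONS` and `units_for` are illustrative, and the .S2-only and .D2-only footnotes are deliberately not modeled.

```python
# Table 2-2 transcribed as data. Side restrictions (.S2 only, .D2 only)
# from the table footnotes are not represented in this sketch.
UNIT_INSTRUCTIONS = {
    ".L": ["ABS", "ADD", "AND", "CMPEQ", "CMPGT", "CMPGTU", "CMPLT",
           "CMPLTU", "LMBD", "MV", "NEG", "NORM", "NOT", "OR", "SADD",
           "SAT", "SSUB", "SUB", "SUBC", "XOR", "ZERO"],
    ".M": ["MPY", "SMPY"],
    ".S": ["ADD", "ADDK", "ADD2", "AND", "B disp", "B IRP", "B NRP",
           "B reg", "CLR", "EXT", "EXTU", "MVC", "MV", "MVK", "MVKH",
           "NEG", "NOT", "OR", "SET", "SHL", "SHR", "SHRU", "SSHL",
           "STP", "SUB", "SUB2", "XOR", "ZERO"],
    ".D": ["ADD", "ADDA", "LD mem", "LD mem (15-bit offset)", "MV", "NEG",
           "ST mem", "ST mem (15-bit offset)", "SUB", "SUBA", "ZERO"],
}


def units_for(mnemonic):
    """Return the sorted list of unit types that can execute a mnemonic."""
    return sorted(unit for unit, instructions in UNIT_INSTRUCTIONS.items()
                  if mnemonic in instructions)


if __name__ == "__main__":
    for name in ("ADD", "MPY", "SHL", "LD mem"):
        print(name, "->", units_for(name))
```

Instructions such as ADD that appear under several unit types give the scheduler more freedom when packing instructions into an execute packet.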

Table 2-3. Functional Unit to Instruction Mapping

                          'C62xx Functional Units
Instruction               .L Unit   .M Unit   .S Unit   .D Unit
------------------------  --------  --------  --------  --------
ADD                       ✓                   ✓         ✓
ADDK                                          ✓
AND                       ✓                   ✓
B IRP                                         ✓†
B NRP                                         ✓†
B reg                                         ✓
CLR                                           ✓
CMPEQ                     ✓
CMPGTU                    ✓
CMPLT                     ✓
CMPLTU                    ✓
EXTU                                          ✓
LD mem                                                  ✓
LD mem (15-bit offset)                                  ✓‡
LMBD                      ✓
MVC                                           ✓†
MV                        ✓                   ✓         ✓
MVK                                           ✓
MVKH                                          ✓
NEG                       ✓                   ✓         ✓
NORM                      ✓
NOT                       ✓                   ✓
OR                        ✓                   ✓
SADD                      ✓
SAT                       ✓
SET                                           ✓
SHL                                           ✓
SHR                                           ✓
SHRU                                          ✓
SMPY                                ✓
SSHL                                          ✓
SSUB                      ✓
ST mem                                                  ✓
ST mem (15-bit offset)                                  ✓‡
STP                                           ✓†
SUB                       ✓                   ✓         ✓
SUBA                                                    ✓
SUB2                                          ✓
SWI
XOR                       ✓                   ✓
ZERO                      ✓                   ✓         ✓

† .S2 only
‡ .D2 only

2.5 Addressing Modes

The addressing mode options on the 'C62xx are linear, circular using BK0, and
circular using BK1. The mode is specified by the addressing-mode register
(AMR).

Eight registers can perform circular addressing. A4-A7 are used by the .D1 unit
and B4-B7 are used by the .D2 unit. No other units can perform circular
addressing. For each of these registers, the AMR specifies the addressing
mode.

LD(B)(H)(W), ST(B)(H)(W), ADDA(B)(H)(W), and SUBA(B)(H)(W) instruc- 
tions all use the AMR to determine what type of address calculations are per- 
formed for these registers. All registers can perform linear mode addressing. 
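
The difference between linear and circular address updates can be sketched as follows. This is a simplified model, not the AMR encoding itself: it assumes the circular block size is a power of two (taken here as a plain byte count rather than a BK0/BK1 field value) and that only the low-order address bits wrap while the high-order bits stay fixed. See the reference guide below for the actual register layout.

```python
# Registers that can be placed in circular mode, as listed in Section 2.5.
CIRCULAR_CAPABLE = {"A4", "A5", "A6", "A7", "B4", "B5", "B6", "B7"}


def linear_add(address, offset):
    """Linear mode: the offset is simply added to the address."""
    return address + offset


def circular_add(address, offset, block_size):
    """Circular mode (simplified): the address wraps inside a block.

    Assumption: block_size is a power of two in bytes, and the block is
    aligned so that only the low log2(block_size) address bits change.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    mask = block_size - 1
    return (address & ~mask) | ((address + offset) & mask)


if __name__ == "__main__":
    pointer = 0x1004
    for _ in range(4):
        pointer = circular_add(pointer, 8, 16)
        print(hex(pointer))
```

With a 16-byte block, a pointer stepping by 8 bytes alternates between two locations inside the block instead of running off its end, which is the behavior delay lines and FIR filter buffers rely on.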

For more information on addressing modes, see the TMS320C62xx CPU and 
Instruction Set Reference Guide (literature number SPRU189). 



2.6 Interrupts 

The 'C6200 CPU has 14 interrupts. These are reset, the non-maskable inter- 
rupt (NMI), and interrupts 4-15. These interrupts correspond to the RESET, 
NMI, and INT4-INT15 signals on the CPU boundary. In some 'C62xx devices 
these signals may be tied directly to pins on the device, connected to on-chip 
peripherals, or may be disabled permanently by being tied inactive on-chip. 
Generally, RESET and NMI are connected directly to pins on the device. 

For more information on interrupts, see the TMS320C62xx CPU and Instruc- 
tion Set Reference Guide (literature number SPRU189). 

Chapter 3

Memory



The TMS320C62xx devices come with on-chip memory that can be selected 
for use as program memory or program cache. The device is available with 
varying sizes of data memory. When off-chip memory is used, the external 
memory interface (EMIF) can unify these spaces to a single memory space on 
most devices. 




3.1 Memory Map

Figure 3-1 shows the memory map of the TMS320C6201 DSP. The total
memory address range of the 'C6201 is 4G bytes (corresponding to the 32-bit
internal address representation). The memory map is divided among the
internal program memory, internal data memory, three external memory
spaces, and internal peripheral space.

Figure 3-1. Memory Map of the TMS320C6201

3.2 Internal Memory

The internal (on-chip) memory is organized into separate data and program
spaces. The 'C62xx has two internal ports to access data memory, each with
a 32-bit data path and a 32-bit byte-address reach. It has a single port to program
memory, with an instruction fetch width of 256 bits and a 30-bit word (four-byte)
address, equivalent to a 32-bit byte address.

3.2.1 Data-Memory System 

The TMS320C6201 data-memory system includes 64K bytes of SRAM and
a memory controller. The TMS320C6201 CPU can access data memory in
8-bit byte, 16-bit halfword, and 32-bit word lengths. The data memory system
supports two memory accesses in a cycle. These accesses can be any
combination of loads and stores from the two data buses of the CPU. Similarly, a
simultaneous internal and external memory access is supported by the data
memory system. The TMS320C6201 data memory system also supports
direct memory access (DMA) and external host accesses. For more information
on DMA operation, see the TMS320C62xx Peripherals Reference Guide
(literature number SPRU190).

The data memory is organized into four banks of 16-bit-wide memory. This
interleaved memory organization allows two simultaneous memory accesses.
Two simultaneous accesses that go to two different internal memory banks
complete in one cycle, which is the fastest access speed.
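
To see why bank placement matters, the sketch below models the four 16-bit banks and checks whether two accesses issued in the same cycle touch a common bank. The exact address-to-bank mapping is an assumption made for illustration (consecutive halfwords are taken to rotate through banks 0 to 3), as is the idea that any shared bank counts as a conflict; consult the device documentation for the real behavior.

```python
NUM_BANKS = 4          # from the text: four banks
BANK_WIDTH_BYTES = 2   # from the text: each bank is 16 bits wide


def banks_touched(byte_address, size_bytes):
    """Set of banks used by an access of size_bytes at byte_address.

    Assumption: halfword n of the data memory lives in bank n % NUM_BANKS.
    """
    first = byte_address // BANK_WIDTH_BYTES
    last = (byte_address + size_bytes - 1) // BANK_WIDTH_BYTES
    return {halfword % NUM_BANKS for halfword in range(first, last + 1)}


def accesses_conflict(access_a, access_b):
    """True if two (byte_address, size_bytes) accesses share a bank."""
    return bool(banks_touched(*access_a) & banks_touched(*access_b))


if __name__ == "__main__":
    print(banks_touched(0, 4))                  # word at address 0
    print(banks_touched(4, 4))                  # word at address 4
    print(accesses_conflict((0, 4), (4, 4)))    # adjacent words
    print(accesses_conflict((0, 4), (8, 4)))    # words eight bytes apart
```

Under this model, two word accesses to adjacent words use disjoint banks, whereas two word accesses eight bytes apart land in the same pair of banks. Placing arrays that are read in parallel at suitable offsets is one way to avoid the second case.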

3.2.2 Program-Memory System 

The TMS320C6201 program-memory system includes 64K bytes of on-chip 
SRAM and a memory/cache controller. The program memory can operate as 
either a 64K-byte internal program memory or as a directly mapped program 
cache. There are four modes under which the TMS320C6201 program 
memory system operates: 

□ Program-memory mode 

□ Cache-enable mode 

□ Cache-freeze mode 

□ Cache-bypass mode 

The DMA can write data into an addressed space of program memory. The 
DMA cannot read from the internal program memory in program memory 
mode. 

When the program memory is used to cache external program data, the 
memory is no longer in valid memory space and cannot be directly addressed; 
therefore, the DMA cannot write or read the internal program memory in any 
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cache mode. The caching scheme implemented in the TMS320C6201 pro- 
gram cache is a direct mapping of external program memory addresses. This 
means that any external address maps to only one cache location, and
addresses that are 64K bytes apart map to the same cache location. The pro-
gram cache is organized into 256-bit frames. Thus, each frame holds one fetch 
packet. The cache stores 2048 fetch packets. 
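
The direct mapping described above can be sketched in C as follows. The frame
size (256 bits, or 32 bytes) and frame count (2048) come from the text; the
function names and the notion of a "tag" formed from the remaining upper
address bits are assumptions introduced only for illustration.

```c
#include <stdint.h>

#define FRAME_BYTES 32u     /* one 256-bit fetch packet per frame */
#define NUM_FRAMES  2048u   /* 64K bytes / 32 bytes per frame */

/* Cache frame selected by an external program byte address. */
static uint32_t cache_frame(uint32_t byte_addr)
{
    return (byte_addr / FRAME_BYTES) % NUM_FRAMES;
}

/* Upper address bits that distinguish addresses sharing one frame
   (an illustrative tag; not a term used in the text). */
static uint32_t cache_tag(uint32_t byte_addr)
{
    return byte_addr / (FRAME_BYTES * NUM_FRAMES);
}
```

Two addresses that differ by exactly 64K bytes yield the same cache_frame
value and different cache_tag values, which is why they compete for the same
cache location.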

A program store to external memory in any cache mode first flushes the
cache frame that is mapped to the target address, to ensure data
coherency in the cache. The data is then written to the external memory at the
addressed location. When that address is accessed again, a cache miss
occurs, causing the new data to be loaded from external memory.

On the change from program memory mode to cache-enabled mode, the pro- 
gram cache is flushed. During a cache freeze, the cache retains its current 
state. A program read to a frozen cache is identical to a read to an enabled 
cache with the exception that the data read from the external interface is not 
stored in the cache on a cache miss. When the cache is bypassed, any pro- 
gram read fetches data from external memory. The data is not stored in the 
cache memory. Like cache freeze, in cache bypass the cache retains its state. 
For details on cache modes, see the TMS320C62xx Peripherals Reference
Guide.
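
The behavior of the four modes, as described in this section, can be summarized
in a small C model. This is only a restatement of the text in code form; the
type and function names are hypothetical and do not correspond to any register
or library definition.

```c
#include <stdbool.h>

/* The four program-memory modes listed in the text. */
typedef enum {
    PMEM_MAPPED,    /* program-memory mode (internal memory is addressable) */
    CACHE_ENABLE,
    CACHE_FREEZE,
    CACHE_BYPASS
} pmem_mode_t;

/* True if a program read may be satisfied from the cache contents. */
static bool read_checks_cache(pmem_mode_t mode)
{
    return mode == CACHE_ENABLE || mode == CACHE_FREEZE;
}

/* True if data fetched from external memory on a miss is stored in the cache. */
static bool miss_updates_cache(pmem_mode_t mode)
{
    return mode == CACHE_ENABLE;
}

/* True if the DMA may write the internal program memory in this mode. */
static bool dma_can_write(pmem_mode_t mode)
{
    return mode == PMEM_MAPPED;
}
```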
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3.3 External Memory Interface (EMIF) 

All external data accesses by the CPU or DMA pass through the external 
memory interface (EMIF). The EMIF is the interface between the CPU and ex- 
ternal memory such as synchronous dynamic random-access memory 
(SDRAM), synchronous-burst static RAM (SBSRAM), and asynchronous 
memory. The EMIF also provides 8-bit and 16-bit wide memory read capability 
to support low-cost boot ROM memories (flash, EEPROM, EPROM, and 
PROM). The production version of the EMIF will support higher throughput in- 
terfaces to SDRAM, including burst capability. 

The interface is programmable to adapt to a variety of setup, hold, and strobe 
widths for asynchronous devices. SBSRAM supports zero-wait state external 
access once bursts have begun. 

In all of these types of access, the EMIF supports 8-bit, 16-bit, and 32-bit ad- 
dressability for writes. All reads are performed as 32-bit transfers. 

The EMIF can receive three types of requests for access. The three types are 
prioritized in this order: CPU data accesses, CPU program fetches, and DMA 
data accesses. When available to service another access, the EMIF services 
the request type of highest priority. For example, DMA requests are not serv- 
iced until the CPU ceases requesting external data and program fetches. 
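
The fixed priority order described above can be expressed as a minimal C
sketch. The enumeration and the function emif_next_request are hypothetical
names for this example; they model only the ordering given in the text, not
the EMIF hardware itself.

```c
#include <stdbool.h>

/* Request types in priority order (lower value = higher priority). */
typedef enum {
    REQ_NONE        = -1,
    REQ_CPU_DATA    = 0,
    REQ_CPU_PROGRAM = 1,
    REQ_DMA         = 2
} emif_req_t;

/* Select the pending request of highest priority, or REQ_NONE if idle. */
static emif_req_t emif_next_request(bool cpu_data_pending,
                                    bool cpu_program_pending,
                                    bool dma_pending)
{
    if (cpu_data_pending)    return REQ_CPU_DATA;
    if (cpu_program_pending) return REQ_CPU_PROGRAM;
    if (dma_pending)         return REQ_DMA;
    return REQ_NONE;
}
```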

The major functions implemented by the EMIF are the following: 

□ Steering incoming data bytes and half-words to form word data (when 
reading from byte and half-word memories) 

□ Interfacing to the 'C6xx internal peripheral bus 

□ Interfacing to SBSRAM and SRAM, including computation of the next ad- 
dress 

□ Interfacing to asynchronous memories, using programmable timing 

□ Interfacing to SDRAM memories 

□ Handshaking with internal and external modules 
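
The first function in the list above, steering bytes and half-words into word
data, amounts to the following packing operation. The sketch assumes
little-endian ordering (the lowest address supplies the least significant
byte); the text does not state the byte order, and the function names are
illustrative only.

```c
#include <stdint.h>

/* Combine four reads from a byte-wide memory into one 32-bit word.
   Assumption: b[0] comes from the lowest address and is least significant. */
static uint32_t steer_bytes(const uint8_t b[4])
{
    return  (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Combine two reads from a halfword-wide memory into one 32-bit word. */
static uint32_t steer_halfwords(const uint16_t h[2])
{
    return (uint32_t)h[0] | ((uint32_t)h[1] << 16);
}
```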

The characteristics of the EMIF are as follows:

□ Zero wait state operation, after an initial two clock cycle latency, with syn- 
chronous burst SRAM (currently available in 125 MHz speed grades). 

□ Support for little or no glue logic interface to asynchronous memory. Tim- 
ing parameters can be programmed to match various asynchronous me- 
mories. 
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□ Serialization between program memory system, data memory system, 
and DMA system. 

□ Support for sharing external memory with another processor. 

□ Support for reading byte and half-word memory devices (ROM) to facili- 
tate boot up from these low cost devices. 

The exact level of throughput to the 'C62xx is determined by the type of 
memory used, and the clock rate of the 'C62xx. For example, when the 'C62xx 
is running at 200 MHz, the maximum throughput would be 800 M-byte/second 
- assuming memory supporting that throughput. 
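
The 800 M-byte/second figure follows from one 32-bit (4-byte) transfer per
clock cycle at 200 MHz. The short C function below reproduces that arithmetic;
the one-transfer-per-cycle assumption and the function name are part of this
sketch, and real throughput depends on the memory type as noted above.

```c
#include <stdint.h>

/* Peak throughput in bytes per second, assuming one full-width
   transfer completes on every clock cycle. */
static uint64_t peak_bytes_per_second(uint64_t clock_hz, unsigned bus_bits)
{
    return clock_hz * (bus_bits / 8u);
}
```

For example, peak_bytes_per_second(200000000, 32) evaluates to 800000000.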

Figure 3-2 shows a diagram of the 'C62xx external memory signals that are 
common to all interfaces. 



Figure 3-2. External Memory Interface (EMIF) Block Diagram 
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For more information on memory, see the TMS320C62xx CPU and Instruction 
Set Reference Guide (literature number SPRU189). For more information on
the EMIF, see the TMS320C62xx Peripherals Reference Guide (literature 
number SPRU190). 
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Peripherals 



In addition to on-chip memory, the TMS320C62xx devices also contain several 
peripherals for communication with off-chip memory, co-processors, and seri- 
al devices. 
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Peripherals for the TMS320C62xx devices may include: 

□ External memory interface (EMIF) 

□ Direct-memory access (DMA) controller 

□ Host-port interface (HPI) 

□ Power-down logic 

□ Enhanced-buffered serial ports (EBSPs) 

□ 32-bit timers 

The first device in the TMS320C62xx series is the TMS320C6201. The
production release of this device will include the EBSPs supporting multivendor
interface protocol (MVIP) and timers to allow easy algorithm integration. The
EBSP is based on the standard TMS320C2x/C5x/C54x serial port. In addition,
it has the ability to buffer serial samples in memory automatically with the aid
of the DMA. It also has multichannel capability compatible with the T1, E1, and
MVIP standards.

Figure 4-1 shows the peripherals for the TMS320C62xx devices. 
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Figure 4-1. Peripherals Overview 
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4.1 External Memory Interface (EMIF) 

The EMIF is the interface between the CPU and external memory such as syn- 
chronous dynamic random-access memory (SDRAM), synchronous burst 
static RAM (SBSRAM), and asynchronous memory. The EMIF also provides 
8-bit and 16-bit wide memory read capability to support low-cost boot ROM 
memories (flash, EEPROM, EPROM, and PROM). The final revision of the 
EMIF will support higher throughput interfaces to SDRAM including burst 
capability. 

More information on the EMIF is provided in Chapter 3, Memory. 

For more information on memory, see the TMS320C62xx CPU and Instruction
Set Reference Guide. For more information on the EMIF, see the TMS320C62xx
Peripherals Reference Guide.
4.2 Direct-Memory Access (DMA)

The on-chip DMA offers two channels which can be configured to transfer in- 
formation from one location in the memory map to another without interfering 
with the operation of the CPU. This allows interfacing to slow external memo- 
ries and peripherals without reducing the throughput to the CPU. The DMA 
controller contains its own address generators, source and destination regis- 
ters, and transfer counter. The DMA has its own bus for addresses and data. 
This keeps the data transfers between memory and peripherals from conflict- 
ing with the CPU. A DMA operation consists of a 32-bit word transfer to or from 
any of the three 'C62xx modules (see Figure 4-2): 

□ Internal Data Memory 

□ Internal Program Memory that is not configured as cache as a destination
of a transfer

□ EMIF

One of the channels can be used by the processor during the boot-load startup
procedure to initialize the internal program memory after reset. The DMA
channels can be used to write to internal program memory.

The boot loader uses the DMA to boot load code from off-chip memory to the 
internal program memory area. An external pin (sampled at reset) selects 
whether this boot load is performed. The serial port can also be used for boot- 
ing. 

The DMA controller can access all internal program memory, all internal data
memory, and all devices mapped to the EMIF. An exception is that the DMA
cannot use program memory as the source of a transfer. Also, it cannot access
memories configured as cache or memory-mapped on-chip peripheral
registers.

The DMA controller has the following features: 

□ Two independent channels 

□ Source and destination addresses may be within the same or different 
modules. These addresses are programmable independently, and can re- 
main constant, increment, or decrement on each transfer. 

□ The transfer count is programmable. Once the transfer count has com- 
pleted, the DMA can send an interrupt to the CPU. 
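
The channel features listed above can be modeled in software as shown below.
This is a behavioral sketch only: the structure dma_channel_t, its field names,
and the functions dma_step and dma_run are hypothetical and do not reflect the
actual DMA register layout, which is given in the device data sheet.

```c
#include <stddef.h>
#include <stdint.h>

/* Address update applied after each transfer. */
typedef enum {
    ADDR_CONSTANT  =  0,
    ADDR_INCREMENT =  1,
    ADDR_DECREMENT = -1
} addr_mode_t;

/* Software model of one DMA channel (illustrative field names). */
typedef struct {
    uint32_t   *src;       /* current source address */
    uint32_t   *dst;       /* current destination address */
    addr_mode_t src_mode;  /* independently programmable */
    addr_mode_t dst_mode;
    size_t      count;     /* remaining 32-bit word transfers */
    int         done_irq;  /* set when the transfer count completes */
} dma_channel_t;

/* Perform one 32-bit word transfer. Returns 1 if a word was moved. */
static int dma_step(dma_channel_t *ch)
{
    if (ch->count == 0)
        return 0;
    *ch->dst = *ch->src;
    ch->count--;
    if (ch->count == 0) {
        ch->done_irq = 1;          /* models the interrupt to the CPU */
        return 1;                  /* leave addresses at the last element */
    }
    ch->src += (int)ch->src_mode;  /* step by one word, or stay constant */
    ch->dst += (int)ch->dst_mode;
    return 1;
}

/* Run the channel until the transfer count is exhausted. */
static void dma_run(dma_channel_t *ch)
{
    while (dma_step(ch)) {
        /* each iteration moves one word */
    }
}
```

With an incrementing source and a decrementing destination, the model copies a
block in reverse order; with a constant source and an incrementing destination,
it fills a block with a single value.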




Figure 4-2. DMA Controller Interconnect to 'C62xx Memory Mapped Modules 
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The DMA has lowest priority to all modules it accesses and it must wait until 
no transfers are being initiated to the internal data and program memory it in- 
tends to access. DMA accesses to internal memory perform cycle stealing; 
therefore, no subsequent CPU accesses of internal memory are hampered by 
a DMA access. However, if the CPU accesses the EMIF while a multi-cycle 
DMA access is in progress, it must wait until that access completes. 

Each DMA channel has an independent set of registers that must be pro- 
grammed to control the operation of the DMA. 

See the data sheet for the specific device to find the memory mapping of DMA
control registers. These registers are 32 bits wide and must be accessed
through 32-bit accesses from the CPU.
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4.3 Host-Port Interface (HPI) 

The host-port interface (HPI) operates as a straightforward asynchronous in- 
terface. The HPI is a 16-bit wide access port through which a host (external) 
processor can read from, and write to, the 'C62xx's internal data memory. 

A host processor access to the 'C62xx internal data memory through the HPI 
consists of two operations, which follow: 

1) The host must gain control over the HPI by performing the request/ac- 
knowledge handshake through the HREQ/HACK signals. 

2) Once access has been granted, the host may perform read and write op- 
erations to the 'C62xx internal data memory. 

The mapping of host-port address to the 'C62xx internal memory address is 
described in the data sheet for your 'C62xx device. 

The HPI on the production release of the TMS320C6201 will have the ability
to boot load the CPU as well as access the full range of the 'C6201's memory.
Also, the HPI will offer improved performance and will be capable of operating
without impacting CPU performance.

For more information on the host port, see the TMS320C62xx Peripherals
Reference Guide (literature number SPRU190).
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4.4 Power-Down Logic 

The 'C62xx supports three power-down modes that can reduce system power
requirements significantly. The modes are as follows:

□ Idle1

□ Idle2

□ Idle3

The three lower bits of the power-down field in the control status register (CSR)
are used to initiate the three power-down modes. If more than one of these
power-down bits is set, the power-down mode is selected by the most
significant bit enabled.

When in a power-down mode, the 'C62xx can be reactivated by a reset, an
enabled interrupt, or any interrupt. Bits three and four of the power-down field in
the control status register set the wake-up condition.

The power-down mode bit and wake-up bit must be set by the same instruction
to ensure proper power-down operation. See the TMS320C62xx Peripherals
Reference Guide (literature number SPRU190) for more details on the
power-down logic.
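
The rules above can be sketched in C as operations on the power-down field
value alone. Several details here are assumptions made for illustration and are
not stated in the text: that bits 0, 1, and 2 correspond to Idle1, Idle2, and
Idle3 respectively, that bit 3 selects wake-up on an enabled interrupt, and
that bit 4 selects wake-up on any interrupt. The position of the field within
the CSR is deliberately left out; see the Peripherals Reference Guide for the
actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit assignments within the power-down field. */
#define PD_IDLE1            0x01u
#define PD_IDLE2            0x02u
#define PD_IDLE3            0x04u
#define PD_WAKE_ENABLED_INT 0x08u   /* bit three (assumed meaning) */
#define PD_WAKE_ANY_INT     0x10u   /* bit four (assumed meaning) */

/* Mode that takes effect when several mode bits are set: the most
   significant set bit wins. Returns 1, 2, or 3, or 0 if none is set. */
static int pd_selected_mode(uint32_t field)
{
    if (field & PD_IDLE3) return 3;
    if (field & PD_IDLE2) return 2;
    if (field & PD_IDLE1) return 1;
    return 0;
}

/* Build the whole field value at once, so that the mode bit and the
   wake-up bit can be written by a single instruction. */
static uint32_t pd_field(uint32_t mode_bits, uint32_t wake_bit)
{
    assert((mode_bits & ~0x07u) == 0u);
    assert(wake_bit == PD_WAKE_ENABLED_INT || wake_bit == PD_WAKE_ANY_INT);
    return mode_bits | wake_bit;
}
```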
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Development 
Support 



The TMS320C6x design environment reflects the unique nature of the ad- 
vanced VLIW architecture. The environment includes code-generation tools, 
evaluation tools, documentation, on-line help with various tools, and a Web 
site on the Internet (www.ti.com/sc/C6x) with complete technical documenta- 
tion. 
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5.1 Code-Generation Tools 

A complete development tool set for both the PC and Sun workstations in- 
cludes the following: 

□ C Compiler 

□ Assembly optimizer 

□ Linker 

The environment is founded on the generation's highly advanced C compiler
and TI's revolutionary assembly optimizer. Figure 5-1 shows the flow of the
code-development process.

The 'C6x generation's C compiler eliminates the need for extensive knowledge 
of DSP architecture, allowing you to take full advantage of the world's most 
powerful DSP. This highly-structured, architecture-independent C code 
development environment dramatically reduces development time for new 
products. At the same time, it maintains the inherent performance benefits of 
the 'C6x generation's advanced VLIW architecture. The 'C6x compiler offers 
a 3X improvement in efficiency over existing fixed-point C compilers for DSPs.



For application code sections that require the fine tuning of assembly code, the 
'C6x generation's unique assembly optimizer provides the same transparent 
programming capability as the C compiler. The tool supports automatic sched- 
uling, optimizing, and separation of fine-grained parallel tasks from serial, in- 
line assembly code - delivering a level of simplicity and power that is unprece- 
dented in assembly-level tools. 

The tools take C or assembly source code and implement many different opti- 
mizations, including software pipelining, to intelligently find and exploit the 
unique instruction-level parallelism of the 'C6x. After each step in the process, 
the 'C6x tools allow you to evaluate their results and take appropriate steps 
to achieve the most parallel code. 

Initially, all C code — new or reused from other applications — is run through 
the C compiler for the 'C6x. Using the evaluation tools described in the 
following section, you can evaluate the code for efficiency. If the performance 
is sufficient for the particular application, then the application has been 
completed, achieving the fastest possible time-to-market and incurring 
minimal engineering cost. 
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Figure 5-1. Code-Development Flow Chart 
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A designer who needs to improve code efficiency can use intrinsics, com- 
mand-line options and source-code enhancements as a first step: 

□ The 'C6x design tools feature two sets of intrinsics. The first set includes 
intrinsics that perform DSP-specific operations on instructions that are not 
supported directly in C. The second set is designed to facilitate 16-bit op- 
eration on a 32-bit machine. These intrinsic functions can be invoked to 
tune the performance of the C code. 

□ The designer can experiment with several command-line options that
cause the compiler to perform more aggressive optimization. One
particularly useful option instructs the compiler to compile an entire application
at once, giving the compiler visibility across program sections and more
knowledge of the way in which variables and functions are used. Another
option causes the compiler to perform global optimizations across an
entire application.

□ Source-code enhancements can be made to exploit specific features of
the 'C6x architecture. For example, the 'C6x has support for operating on
words containing two 16-bit quantities; therefore, you can use 32-bit
loads and stores when operating on arrays containing 16-bit data and
easily achieve a 2X performance improvement.
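
The source-code enhancement in the last item can be illustrated with portable
C. The sketch below computes a dot product of 16-bit data two ways: one element
per load, and two elements per 32-bit load. It uses only standard C in place of
the 'C6x intrinsics, which are not named in this document, so it shows the
access pattern rather than the generated code. The function names are
hypothetical, and the element count is assumed to be even.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sign-extend the low 16 bits of v to a 32-bit signed value. */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Reference version: one 16-bit load per element. */
static int32_t dot16_scalar(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}

/* Packed version: each 32-bit load fetches two 16-bit elements,
   halving the number of loads. n must be even. */
static int32_t dot16_packed(const int16_t *a, const int16_t *b, size_t n)
{
    assert(n % 2 == 0);
    int32_t sum = 0;
    for (size_t i = 0; i < n; i += 2) {
        uint32_t wa, wb;
        memcpy(&wa, &a[i], sizeof wa);   /* one 32-bit load from a */
        memcpy(&wb, &b[i], sizeof wb);   /* one 32-bit load from b */
        sum += sext16(wa) * sext16(wb);              /* low halves  */
        sum += sext16(wa >> 16) * sext16(wb >> 16);  /* high halves */
    }
    return sum;
}
```

Because matching halves of the two words are always multiplied together, the
result does not depend on the byte order of the host.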

Taken together, these actions extract a large amount of parallelism from C 
code. 

For ultra-high performance applications, extracting every last bit of throughput 
from the application code may be necessary. The profiler can identify critical 
code segments that might benefit most from being generated in assembly lan- 
guage. 

For these program sections, the designer writes simple, linear 'C6x assembly
code that is input to the TI assembly optimizer. This assembly code consists of
'C6x instructions written without concern for parallel instructions, instruction
latencies, or register usage.

The assembly optimizer tool schedules the instructions, taking into account 
the architectural parallelism. The tool honors 'C6x latency requirements, maxi- 
mizes parallel code, and performs register allocation. 



5.2 Evaluation Tools 

The evaluation tools include the following: 

□ Windows-based debugger interface 

□ Simulator 

□ Hardware emulation board 

The 'C6x development environment provides a new intuitive Windows™-based
graphical user interface (GUI) for debugging. The debugger interface
features windows for source, assembly, call stack, memory, registers, and
watch expressions as well as menu and tool bars. The debugger offers
one-click breakpoint setting and dialogs for editing breakpoints. The debugger also
incorporates the dynamic profiler to help users find bottlenecks and improve
code efficiency.

TI will provide 'C6201 scan-based emulation systems that support hardware
and software debugging of target systems via a JTAG-emulation cable.
Scan-based emulation is a unique, non-intrusive approach to system emulation,
integration, and debugging.

Initially, TI is offering a stand-alone 'C6201 test and emulation board (TEB) that
interfaces with the host platform through the XDS510™ and XDS510WS
emulators via the IEEE Standard 1149.1 (JTAG)-compliant port. The board
features a prototyping area for adding user-defined peripherals. With the
addition of other 'C6x generation members, TI will continually add functionality to
the common development environment as well. Capabilities will ultimately
include a PC plug-in evaluation module (EVM) board, a low-cost PC-based
board that is well-suited for software algorithm development.

The dynamic profiler integrated into the 'C6x debugger creates cycle histo- 
grams that are continuously updated as the code runs. It can show graphically 
which functions, ranges and lines in an application are performance bottle- 
necks. The profiler can show: 

□ The percentage of total execution time spent in any function 

□ The number of times a function is called 

□ Total cycles in the application, a function, or a line 

A timing display can be built into the application by inserting a few function calls 
in the code. The resulting simple cycle counts, obtained without using the pro- 
filer or the debugger, can be printed automatically to allow you to track the 
changes in execution speed of an algorithm over time. This output, while less 
sophisticated, is continuously available with no further action. 
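
A timing display of the kind described above might look like the following C
sketch. It uses the standard clock() function so that it runs on any host; on
the target, a device cycle counter would presumably take its place. The
workload and the function names are hypothetical stand-ins for application
code.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical workload standing in for the algorithm being timed. */
static unsigned long workload(unsigned long n)
{
    volatile unsigned long acc = 0;   /* volatile keeps the loop from being removed */
    for (unsigned long i = 0; i < n; i++)
        acc += i;
    return acc;
}

/* Run the workload between two timing calls and print the elapsed
   clock ticks. Returns the tick count; stores the workload result. */
static clock_t timed_workload(unsigned long n, unsigned long *result)
{
    clock_t start = clock();
    *result = workload(n);
    clock_t stop = clock();
    printf("workload(%lu): %ld ticks\n", n, (long)(stop - start));
    return stop - start;
}
```

Logging this output from run to run gives a simple record of how the execution
speed of an algorithm changes as the code is tuned.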
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5.3 Third-Party Support 



TI has a long history of strong third-party support, and this continues with the
'C62xx devices. Table 5-1 lists some of the companies supporting the 'C62xx
devices and the product areas. Table 5-2 lists the third-party support contacts
with telephone numbers and electronic mail addresses.

Table 5-1. Third-Party Support Companies and Product Area Supported 



Company                          Product Area

Ariel Corporation                High-performance VME64 platform and computer telephony products
Cheops GmbH & Co KG              Industrial and medical imaging and high speed/high resolution videoconferencing
D2 Technologies, Inc.            Embedded Voice Processing (EVP™) computer telephony software
DSP Research, Inc.               TIGER development boards and OEM systems
DSP Software Engineering, Inc.   Multi-channel V.34bis soft-modem and telecom software
Eonic Systems, Inc.              Real-time operating systems - Virtuoso Nano™, Classico™, and MicroLite™
Go DSP Corporation               Code Composer™ support and next generation development tool, Code Maestro™
HotHaus Technologies, Inc.       HausWare - DSP software architecture for embedded telecommunications applications
Innovative Integration           PCI6201 DSP coprocessor for telecom, communications and data acquisition applications
Loughborough Sound Images        PCI/C6200 - signal processing platform and PCI/C6220 telecommunications/high density DSP telephony platform
Pentek, Inc.                     Scaleable multi-processor board for the VMEbus - model 9134
Signals & Software Ltd. (SASL)   Very high density ISP modem solution
ViaDSP, Inc.                     InvisiLink™ - line of software and firmware for high density computer telephony boards
White Mountain DSP, Inc.         Emulation and multiplatform debug support - Mountain-510, Mountain-510/WS and Mountain-510/LT PCMCIA Card
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Table 5-2. Contacts for Third-Party Support 



Third-Party Contact              Phone Number       E-mail Address

Ariel Corporation                609 860-2900       ariel@ariel.com
Cheops GmbH & Co KG              +49 8861 2369      100541.3370@compuserve.com
D2 Technologies, Inc.            805 564-3424       sales@d2tech.com
DSP Research, Inc.               408 773-1042       info@dspr.com
DSP Software Engineering, Inc.   617 275-3733       info@dspse.com
Eonic Systems, Inc.              301-572-5000       info@eonic.com
GO DSP Corporation               416 599-6868       gdasilva@go-dsp.com
HotHaus Technologies, Inc.       604-278-4300       info@hothaus.com
Innovative Integration, Inc.     818-865-6150       techsprt@innovative-dsp.com
Loughborough Sound Images        +44 1509 634444
Pentek, Inc.                     201-818-5900       info@pentek.com
Signals & Software Ltd. (SASL)   +44 181 426 9533   davem@sasl.demon.co.uk
ViaDSP, Inc.                     508-369-0048       dpenny@viadsp.com
White Mountain DSP, Inc.         603 883-2430       info@wmdsp.com
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5.4 Web Site and Documentation 

Visit the Web site at www.ti.com/sc/C6x for information, an interactive 
multimedia technical overview (MeTO), documentation, and schedule of 'C6x 
design workshops. The MeTO describes features of the devices in a visual 
way with graphics in a point-and-click display for ease of navigation. The Web 
site offers a complete training schedule of design workshops and seminars. 
Applications assistance and frequently asked questions (FAQ) are also on the
Web site.

Documentation is available directly from the Web site in down-loadable files 
for printing. There is a complete list of documentation available in the Preface 
under Related Documentation from Texas Instruments. 
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Glossary 




address: The number of a particular memory or peripheral storage location. 

ALU: Arithmetic logic unit. The high-speed CPU circuit that does calculating 
and comparing. Numbers are transferred from registers into the ALU for 
calculation, and the results are sent back to a register. 

ASIC: Application-specific integrated circuit. A custom chip designed for a
specific application. It is designed by integrating standard cells from a
library.

Assembler: A software program that creates a machine-language program
from a source file that contains assembly language instructions,
directives, and macro definitions. The assembler substitutes absolute
operation codes for symbolic operation codes.

Assembly Optimizer: A software program that optimizes linear assembly
code, which is assembly code that has not been register-allocated or
scheduled. The assembly optimizer is automatically invoked with the
shell program, cl6x, when one of the input files has a .sa extension.



B 



boot loader: A built-in segment of code that transfers code from an external 
source to program memory at power up. 



C



clock cycles: Cycles based on the input from the external clock. 

code: A set of instructions written to perform a task; a computer program or 
part of a program. 

CPU: Central processing unit. The unit that coordinates the functions of a 
processor. 
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Data Memory: Memory accessed through the 'C6x's RAM interface. 

DMA: Direct memory access. Specialized circuitry that transfers data from 
memory to memory without using the CPU. 

DRAM: Dynamic random access memory. The most common type of com- 
puter memory. 

EBSP: Enhanced buffered serial ports.

EMIF: External memory interface.

execute packet: A packet of instructions that execute in parallel.

external interrupt: A hardware interrupt triggered by a pin.

fetch packet: A packet containing up to eight instructions held in memory 
for execution by the CPU. 

global interrupt enable (GIE): A bit in the control status register (CSR). 
Used to enable or disable maskable interrupts. 

hardware interrupt: An interrupt triggered through physical connections 
with on-chip peripherals or external devices. 

HPI: Host port interface 



interrupt: A condition caused either by an event external to the CPU or by 
a previously executed instruction that forces the current program to be 
suspended and causes the processor to execute an interrupt service 
routine corresponding to the interrupt. 

interrupt service fetch packet (ISFP): A fetch packet used to service inter- 
rupts. If 8 instructions are insufficient, the user must branch out of this 
block for additional interrupt service. If the delay slots of the branch do 
not reside within the ISFP, execution continues from execute packets in 
the next fetch packet (the next ISFP). 

ISFP: Interrupt service fetch packet. 

IFP: Instruction fetch packet. 



latency: The delay between when a condition occurs and when the device 
reacts to the condition. Also, in a pipeline, the necessary delay between 
the execution of two instructions to ensure that the values used by the 
second instruction are correct. 

LSB: Least significant bit. The lowest order bit in a word.



maskable interrupt: A hardware interrupt that can be enabled or disabled 
through software. 

memory interleaving: A category of techniques for increasing memory 
speed. 

MIPS: Million instructions per second. The execution speed of a computer.

MSB: Most significant bit. The highest order bit in a word.



NMI: Non-maskable interrupt 

nonmaskable interrupt: An interrupt that can be neither masked nor dis- 
abled. 
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overflow: A condition in which the result of an arithmetic operation exceeds 
the capacity of the register used to hold that result. 



pipeline: A method of executing instructions in an assembly-line fashion. 

pipeline processing: A category of techniques that provide simultaneous, 
or parallel, processing within the computer. It refers to overlapping 
operations by moving data or instructions into a conceptual pipe with all 
stages of the pipe processing simultaneously. 

PLL: Phase-locked loop. 

program memory: Memory accessed through the C6x's program fetch in- 
terface. 



RAM: Random-access memory. 

register: A group of bits used for temporarily holding data or for controlling 
or specifying the status of a device. 

reset: A means of bringing the CPU to a known state by setting the registers 
and control bits to predetermined values and signaling execution to start 
at a specified address. 

RISC: Reduced instruction set computing. A computer architecture that re- 
duces chip complexity by using simpler instructions. 



SBSRAM: Synchronous burst static random-access memory.

SDRAM: Synchronous dynamic random-access memory. 

shifter: A hardware unit that shifts bits in a word to the left or to the right. 
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VelociTI: Architecture developed by Tl that features very long instruction 
words 

VLIW: Very long instruction word. 
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